[CPU]whisper readvalue optimize #26130

Merged

Conversation

xipingyan
Contributor

@xipingyan xipingyan commented Aug 20, 2024

Details:

Tickets:

  • 128743

Profile each node's execution time.
Support static and dynamic inference.

Signed-off-by: xipingya <[email protected]>
If reset is not called, these marked nodes don't need to be executed either.

Signed-off-by: xipingya <[email protected]>
@xipingyan xipingyan requested a review from maxnick August 20, 2024 08:33
@github-actions github-actions bot added category: Core OpenVINO Core (aka ngraph) category: CPU OpenVINO CPU plugin category: transformations OpenVINO Runtime library - Transformations category: CPP API OpenVINO CPP API bindings labels Aug 20, 2024
@github-actions github-actions bot removed the category: transformations OpenVINO Runtime library - Transformations label Sep 3, 2024
@github-actions github-actions bot removed category: Core OpenVINO Core (aka ngraph) category: CPP API OpenVINO CPP API bindings labels Sep 10, 2024
xipingyan and others added 3 commits December 9, 2024 19:11
…s/stateful_sdpa_fusion.cpp

Co-authored-by: Maksim Kutakov <[email protected]>
…ision.

So change convert's dst precision to i8.

Signed-off-by: xipingya <[email protected]>
Contributor

@luo-cheng2021 luo-cheng2021 left a comment

LGTM, thanks.

…e_optimize

# Conflicts:
#	src/plugins/intel_cpu/src/nodes/input.cpp
#	src/plugins/intel_cpu/src/transformations/cpu_opset/convert_to_cpu_specific_opset.hpp
…e_optimize

# Conflicts:
#	src/plugins/intel_cpu/src/graph_optimizer.cpp
#	src/plugins/intel_cpu/src/nodes/input.cpp
#	src/plugins/intel_cpu/src/nodes/memory.cpp
#	src/plugins/intel_cpu/src/nodes/memory.hpp
#	src/plugins/intel_cpu/src/transformations/cpu_opset/convert_to_cpu_specific_opset.hpp
@xipingyan xipingyan force-pushed the xp/whisper_readvalue_optimize branch from de098e3 to 9dbb1df Compare December 12, 2024 07:37
Contributor

@maxnick maxnick left a comment

Nice work!

@maxnick
Contributor

maxnick commented Dec 13, 2024

Internal performance validation is required.

@maxnick
Contributor

maxnick commented Dec 13, 2024

@dmitry-gorokhov , could you please take a look?

@maxnick
Contributor

maxnick commented Dec 20, 2024

Internal performance validation passed.

@dmitry-gorokhov
Contributor

General question: does this optimization support init subgraphs that are fully or partially shared between several ReadValue ops (a common scenario for LLMs)? If not, are there any ideas for how the proposed optimization could be extended to those cases?

@xipingyan
Contributor Author

xipingyan commented Dec 24, 2024

> General question: does this optimization support init subgraphs that are fully or partially shared between several ReadValue ops (a common scenario for LLMs)? If not, are there any ideas for how the proposed optimization could be extended to those cases?

No. For past K and V, I skipped it (https://github.com/openvinotoolkit/openvino/pull/26130/files#diff-c12bcfcf5456497adee38ad4362bde01528476b601fae102135a56810dbb70deR273-R279). Because the current PR is already a bit complex, maybe we can postpone implementing this for https://github.com/xipingyan/openvino/blob/17f10b34f1c1a824a366ceed14c51b44163a1d50/src/plugins/intel_cpu/src/nodes/memory.hpp#L263.

I think the new extension will either share or duplicate the init-graph code between MemoryInputSDPA and MemoryInput.
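
For readers following the thread, the idea behind the optimization (the init subgraph feeding a ReadValue only needs to run when the state has been reset) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyStateNode`, `InitFn`, and `init_runs()` are hypothetical names invented for illustration and are not part of OpenVINO or of this PR's code. It does not cover init subgraphs shared between several ReadValue ops, which the discussion above leaves for later work.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a ReadValue-like state node; NOT OpenVINO API.
class ToyStateNode {
public:
    // Stand-in for the init subgraph that produces the initial state value.
    using InitFn = std::function<std::vector<float>()>;

    explicit ToyStateNode(InitFn init) : init_(std::move(init)) {}

    // Models resetting the state: the next read must re-run the init subgraph.
    void reset() { needs_init_ = true; }

    // Returns the current state, running the init subgraph only when required.
    const std::vector<float>& read() {
        if (needs_init_) {
            state_ = init_();
            ++init_runs_;
            needs_init_ = false;
        }
        return state_;
    }

    // Models an Assign-like update: overwrites the state, so no init is needed.
    void assign(std::vector<float> value) {
        state_ = std::move(value);
        needs_init_ = false;
    }

    // Number of times the init subgraph actually executed.
    int init_runs() const { return init_runs_; }

private:
    InitFn init_;
    std::vector<float> state_;
    bool needs_init_ = true;  // a fresh state behaves as if it had been reset
    int init_runs_ = 0;
};
```

In this toy model, repeated `read()` calls after the first one skip the init function entirely until `reset()` is called again, which is the behavior the commit message "If reset is not called, these marked nodes don't need to be executed" describes at a high level.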

@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Dec 24, 2024
Merged via the queue into openvinotoolkit:master with commit 416bd98 Dec 24, 2024
180 checks passed
Labels
category: CPU OpenVINO CPU plugin

6 participants